-
-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
caddyhttp: Add MatchWithError
to replace SetVar hack
#6596
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mostly LGTM. But also, don't we want to remove the error returned in the context since it's no longer needed?
I kept the error in context for full backwards compat for places that still call Match. I think we need to do this in a few steps over the course of a few releases |
Does anyone really know about or use that though? Anyway, when you're ready to have this merged, go ahead and mark as ready. I'm OK with it if there's another commit to actually use the error return value gets into this PR, or we can do that later. |
I'll do some searches with SourceGraph just in case. |
79b1e5c
to
54a305c
Compare
Plugged the IP matcher hole, updated the tests to use |
1f22058
to
f6253a7
Compare
modules/caddyhttp/ip_matchers.go
Outdated
if r.TLS != nil && !r.TLS.HandshakeComplete { | ||
return false // if handshake is not finished, we infer 0-RTT that has not verified remote IP; could be spoofed | ||
return false, fmt.Errorf("TLS handshake not complete, remote IP cannot be verified") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mholt should this be a caddyhttp.Error
? If so, what HTTP status code should we use? I think as-is it would be a 500 without being otherwise specified but I'm not sure if that's correct. I don't know how it should behave on an incomplete handshake. Is there some 4xx series code which can instruct the client to try again or w/e?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some 4xx series code which can instruct the client to try again or w/e?
@francislavoie, I was able to work around the issue mentioned here #6427 (comment) with this in front of my ip matchers.
@early tls early_data
error @early 425
Before I have added this, I was able to reproduce my issue where the IP matcher "failed" (returned false for my IP address) after approx. 10 minutes of inactivtiy and then pressing F5 to refresh the page.
After have this added, I was no longer able to reproduce the issue.
Based on that experience, I would think the HTTP status code 425 is the desirable one here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thanks, good to know! I'm away from home today, would you be willing to checkout this branch, change this line to a caddyhttp.Error
, build it and test it? Would be nice to confirm. 425 makes sense according to https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/425 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will try it - just a bit difficult in that particular environment, because my 2.9.0 tests require the "promote-metrics" branch. But I have other hosts for testing, where I can disable metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's looking good 👍
My change was this: matcher-with-error...steffenbusch:caddy:matcher-with-error
Before:
With 2.9.0-beta from "promote-metrics" branch, which means without your change, I got the undesired 404 because the IP-matcher was not matching with my IP address; followed by a success on a F5:
tail -f access.json | jq '{user_id, status, ts, tls: .request.tls, remote_ip: .request.remote_ip, uri: .request.uri}'
{
"user_id": "",
"status": 404,
"ts": 1728816499.5479946,
"tls": {
"resumed": true,
"version": "TLSv1.3",
"cipher_suite": "TLS_AES_128_GCM_SHA256",
"proto": "h3",
"server_name": "rambo.stbu.net"
},
"remote_ip": "95.89.172.31",
"uri": "/test/ip"
}
{
"user_id": "",
"status": 200,
"ts": 1728816503.3737376,
"tls": {
"resumed": true,
"version": "TLSv1.3",
"cipher_suite": "TLS_AES_128_GCM_SHA256",
"proto": "h3",
"server_name": "rambo.stbu.net"
},
"remote_ip": "95.89.172.31",
"uri": "/test/ip"
}
After:
Building it based on your branch with the httperror 425 tweak, it was working in my browser, and this was logged in access.json:
{
"user_id": "",
"status": 425,
"ts": 1728817672.261492,
"tls": {
"resumed": true,
"version": "TLSv1.3",
"cipher_suite": "TLS_AES_128_GCM_SHA256",
"proto": "h3",
"server_name": "rambo.stbu.net"
},
"remote_ip": "95.89.172.31",
"uri": "/test/ip-chrome"
}
{
"user_id": "",
"status": 200,
"ts": 1728817672.265926,
"tls": {
"resumed": true,
"version": "TLSv1.3",
"cipher_suite": "TLS_AES_128_GCM_SHA256",
"proto": "h3",
"server_name": "rambo.stbu.net"
},
"remote_ip": "95.89.172.31",
"uri": "/test/ip-chrome"
}
The 425
is logged, but I have not observed any issue in my Chrome Browser (maybe a little bit longer the "pending" than usually) and saw the result of the request immediately, which was logged at 200
instantly after the 425
.
However, I noticed a build-error for this branch related to github.com/mholt/caddy-ratelimit:
2024/10/13 11:27:46 [INFO] exec (timeout=0s): /usr/local/go/bin/go build -o /usr/local/bin/caddy -ldflags -w -s -trimpath -tags nobadger
# github.com/mholt/caddy-ratelimit
/root/go/pkg/mod/github.com/mholt/caddy-ratelimit@v0.0.0-20240828171918-12435ecef5db/handler.go:176:7: multiple-value rl.matcherSets.AnyMatch(r) (value of type (bool, error)) in single-value context
2024/10/13 11:29:15 [INFO] Cleaning up temporary folder: /tmp/buildenv_2024-10-13-1126.189145495
2024/10/13 11:29:15 [FATAL] exit status 1
Error: building at STEP "RUN go install github.com/caddyserver/xcaddy/cmd/xcaddy@latest && /usr/local/bin/xcaddy build --output /usr/local/bin/caddy --with github.com/caddyserver/caddy/v2=github.com/caddyserver/caddy/v2@matcher-with-error --with github.com/caddyserver/transform-encoder --with github.com/ueffel/caddy-tls-format --with github.com/muety/caddy-remote-host --with github.com/ggicci/caddy-jwt --with github.com/caddy-dns/ionos --with github.com/ueffel/caddy-brotli --with github.com/aksdb/caddy-cgi/v2 --with github.com/mholt/caddy-webdav --with github.com/mholt/caddy-ratelimit && /usr/local/bin/caddy version": while running runtime: exit status 1
For the above tests, I have removed the ratelimit plugin to get it compiled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent, thanks! Will apply the fix later today. Good callout on AnyMatch with ratelimit, I assumed no plugins were using that so I made a more aggressive API change. I'll fix that up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay the AnyMatch
issue should be fixed now, and changed to use 425 status, if you want to confirm I did it right :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be a caddyhttp.Error? If so, what HTTP status code should we use? I think as-is it would be a 500 without being otherwise specified but I'm not sure if that's correct. I don't know how it should behave on an incomplete handshake. Is there some 4xx series code which can instruct the client to try again or w/e?
I'm not really sure :( The problem is that neither the client nor the server have done anything particularly wrong... there is no actual error. We simply do not have a value that can be trusted.
You could argue it's a 4xx error since the client should not have done a 0-RTT, but does it know that? Probably not.
You could argue it's a 5xx error since the server can't correctly do what it was asked, but is that its fault? Not really.
I don't believe there's any HTTP status that can cause the TLS connection to be redone. 500 is the best all around option when you don't know what other status code to use. But I still don't like the idea of an HTTP error when no HTTP error has occurred 😕
It's possible (?) that a 426 Upgrade Required could cause the client to break the TLS connection and re-establish it if sent with a Connection: close
header instead of Connection: Upgrade
, but I have no idea honestly. Real-world testing would be required.
This is kinda why I didn't implement errors with matchers at first 🫠
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could argue it's a 4xx error since the client should not have done a 0-RTT, but does it know that? Probably not.
Per the RFC, 425 is the correct status, as mentioned earlier:
The server can cause a client to retry individual requests and
not use early data by responding with the 425 (Too Early) status
code (Section 5.2) in cases where the risk of replay is judged
too great.
There are some caveates such as
Note that a server cannot choose to selectively reject early data at
the TLS layer. TLS only permits a server to either accept all early
data or none of it. Once a server has decided to accept early data,
it MUST process all requests in early data, even if the server
rejects the request by sending a 425 (Too Early) response.
A server cannot make a request that contains the Early-Data header
field safe for processing by waiting for the handshake to complete.
A request that is marked with Early-Data was sent in early data on a
previous hop. Requests that contain the Early-Data header field and
cannot be safely processed MUST be rejected using the 425 (Too Early)
status code.
The RFC also says the clients must handle 425 properly by establishing a new connection to the server and retry the request.
@mholt I've changed enough in the past few commits that I think it warrants another review. |
60813a9
to
cf3acd2
Compare
cf3acd2
to
03667c6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the thoughtful work on this. I don't love that we have (?) to do this, but I guess that's how it is.
If we are going to make this an actual breaking change in the future, I'd rather just put an error return on the signature of Match
.
// An error during matching will abort the request middleware chain and | ||
// invoke the error middleware chain. | ||
// | ||
// This will eventually replace RequestMatcher. Matcher modules |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm kinda bummed because if we did this I would want it to be Match(*http.Request) (bool, error)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I don't see how. It's like DialContext
etc. Something to redo in v3 lmao
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, but if we remove this interface, it'll be a breaking change anyway, so might as well break it into something that we want right? 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can:
- Add
RequestMatcherWithError
(this) - Give plugins time to implement it, all plugins should stop using the interface guard for
RequestMatcher
- Later, change the signature of
RequestMatcher
toMatch(*http.Request) (bool, error)
(should be safe since plugins no longer use it) - Deprecate
RequestMatcherWithError
, preferringRequestMatcher
if implemented - Remove
RequestMatcherWithError
All this might take months or years but that would be the path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I'm thinking too. It'll kind of suck, but... might be worth it.
03667c6
to
73044e1
Compare
As recently discussed on Slack, our
RequestMatcher
interface is sometimes insufficient in situations where matchers cannot safely operate because it can only return a boolean. Since matchers make a decision whether or not handlers can run, returning false isn't a secure decision because it may cause handlers that must run (e.g. authentication) to not get run, causing possible bypass of security-critical functionality in the config.We already had a mechanism for this, which I added in #4346 via putting a variable in Context and reading it back after the matcher ran. That was absolutely a hack and not the best solution. I couldn't think of a better way at the time (lack of Go skills 😅)
To solve this, the better solution is to introduce a new
RequestMatcherWithError
interface which returns(bool, error)
, so that an error can be returned to cancel the HTTP request pipeline if the matcher's preconditions aren't met. For now, we can make this optional (so if a matcher implements this interface we call it if possible) until we know all matcher plugins have caught up implementing with this (though unfortunately guaranteeing that is quite difficult and a multi-step process to clean up).